
feat(recipes): recipes-as-content registry — bundled .md + project-local .codemap/recipes/#37

Merged
SutuSebastian merged 14 commits into main from feat/recipes-content-registry
May 2, 2026

Conversation

@SutuSebastian

@SutuSebastian SutuSebastian commented May 1, 2026

Contributor

Summary

Implements the recipes-as-content registry — pairs every bundled recipe with a sibling .md description body AND lets projects ship their own recipes via .codemap/recipes/<id>.{sql,md}. Surfaces uniformly in --recipes-json, codemap query --recipe <id>, and the codemap://recipes / codemap://recipes/{id} MCP resources shipped in PR #35.

Status: Ready for review. All 6 grill questions settled (recorded inline in the now-deleted plan; canonical bits lifted into architecture.md, glossary.md, agent rule + skill per Rule 2 / Rule 10). All 6 tracers shipped.

Wins

  • Bundled recipes get room to teach. SQL stays as .sql (editor highlighting, sqlite3 .read works); .md body holds the long-form "when to use" / "follow-up SQL" notes. Per-row actions templates lifted from the in-code map into YAML frontmatter on the .md — uniform shape across bundled + project.
  • Project teams ship internal SQL without forking: drop <id>.sql into .codemap/recipes/ and every team member can run it by id (e.g. codemap query --recipe internal-flaky-tests). Recipes are git-tracked source code, code-reviewable in PRs.
  • MCP discovery auto-inherits. codemap://recipes carries source: "bundled" | "project" on every entry + shadows: true on project entries that override bundled. Agents reading the catalog at session start see the override before they call the tool.

Settled grill questions

| # | Question | Answer |
| --- | --- | --- |
| Q-A | Bundled storage layout | File-pair <id>.{sql,md} (uniform with project; .sql syntax highlighting; single loader code path) |
| Q-B | Loading time | Eager at startup (sub-millisecond cost; "registry always populated") |
| Q-C | Project recipe discovery | Root-only: <projectRoot>/.codemap/recipes/ (consistent with .codemap.db resolution; walk-up is forward-compatible if a consumer asks) |
| Q-D | Project recipe actions | YAML frontmatter on <id>.md; hand-rolled parser (~50 LOC, no js-yaml dep) |
| Q-E | Conflict resolution | Silent at runtime + shadows: true flag at discovery + agent-skill prompt update (per-execution shape unchanged for plan §4 uniformity) |
| Q-F | Validation strictness | Both: load-time DML/DDL deny-list (recipe-aware UX) + runtime PRAGMA query_only=1 backstop (parser-proof safety net from PR #35) |

Tracer-bullet sequence (all shipped)

  • 1. loader scaffold: src/application/recipes-loader.ts (pure, transport-agnostic) + 15 tests
  • 2. migrate bundled recipes: 12 entries × .{sql,md} files in templates/recipes/; query-recipes.ts becomes a thin Proxy shim that preserves backwards-compat (QUERY_RECIPES, getQueryRecipeSql, etc.)
  • 3. project-local loader: .codemap/recipes/<id>.sql discovery + sibling .md + shadow detection
  • 4. catalog payload extension: --recipes-json and MCP resources gain body / source / shadows fields
  • 5. YAML frontmatter parser + DML/DDL deny-list: bundled BUNDLED_RECIPE_ACTIONS map deleted; uniform shape end-to-end
  • 6. docs + agents update: architecture / glossary / README / rule + skill across .agents/ and templates/agents/ (Rule 10), plan deleted (Rule 2), minor changeset

Architecture

Mirrors the cmd-* ↔ *-engine seam from PRs #33 / #35:

  • src/application/recipes-loader.ts: pure loader (file-pair reading, frontmatter parsing, validation, merging with shadow detection). No CLI / runtime dependency.
  • src/cli/query-recipes.ts: thin shim. Caches loader output keyed by (bundledDir, projectDir) so multi-root sessions re-resolve cleanly. Preserves legacy QUERY_RECIPES Proxy + getQueryRecipeSql / getQueryRecipeActions / listQueryRecipeIds / listQueryRecipeCatalog exports for callers.
  • src/application/mcp-server.ts: the codemap://recipes/{id} resource now uses getQueryRecipeCatalogEntry() so the per-id payload includes body / source / shadows.
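
A minimal sketch of the "cache keyed by (bundledDir, projectDir)" idea, for readers who want to see why a composite key lets multi-root sessions re-resolve. This is not the shim's actual code: `createCachedLoader`, `Registry`, and the paths are hypothetical names chosen for illustration.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not the project's code.
type Registry = { ids: string[] };
type Loader = (bundledDir: string, projectDir?: string) => Registry;

function createCachedLoader(load: Loader): Loader {
  const cache = new Map<string, Registry>();
  return (bundledDir, projectDir) => {
    // A JSON array gives an unambiguous composite key, including the
    // "no project dir" case (stored as null).
    const key = JSON.stringify([bundledDir, projectDir ?? null]);
    let hit = cache.get(key);
    if (hit === undefined) {
      hit = load(bundledDir, projectDir);
      cache.set(key, hit);
    }
    return hit;
  };
}

let loads = 0;
const cachedLoad = createCachedLoader((bundledDir, projectDir) => {
  loads += 1; // count real loads so cache hits are observable
  return { ids: [bundledDir, projectDir ?? "(none)"] };
});

cachedLoad("/pkg/templates/recipes", "/repo-a/.codemap/recipes");
cachedLoad("/pkg/templates/recipes", "/repo-a/.codemap/recipes"); // cache hit
cachedLoad("/pkg/templates/recipes", "/repo-b/.codemap/recipes"); // new root, reloads
console.log(loads); // 2
```

Keying on both directories means a test fixture that swaps the project root gets a fresh registry without needing an explicit reset.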

Test plan

  • bun run check green (typecheck + lint + format + tests + benchmark)
  • 54 unit tests across recipes-loader.test.ts (43) + query-recipes.test.ts (11) — covers every grill decision: file-pair loading, sibling-md pairing, shadow detection, DML/DDL deny-list (every keyword + case-insensitive + comment-stripping), YAML frontmatter (no-frontmatter / type-only / type+description / auto_fixable boolean / multiple items / no-actions-key / malformed), root-only resolution, eager cache invalidation across roots
  • Smoke tested: bun src/index.ts query --recipes-json and --print-sql fan-out against the new file-based registry — output unchanged
  • CI green on every push

Self-audit

  • .agents/ rules respected (tracer-bullets, verify-after-each-step, concise-comments — backticks-in-comments lessons re-learned twice and already in .agents/lessons.md)
  • ✅ Performance: eager-cached, projectDir-keyed; no new deps; hand-rolled YAML scoped to actions only
  • ✅ Architecture: clean engine seam; backwards-compat Proxy shim
  • Gitignore verified via git check-ignore: .codemap/recipes/ is not matched by the existing .codemap.* literal-dot pattern; recipes are git-tracked source by default
  • Defence in depth on safety: load-time lexical check + runtime PRAGMA query_only=1 backstop

Composition with shipped surface

Reuses every bundled-recipe entry-point from PRs #26 / #28 / #30 / #33 / #35. The MCP resources auto-inherit project recipes because they call listQueryRecipeCatalog(), which is now backed by the loader. No new MCP tools, no new CLI flags.

Summary by CodeRabbit

  • New Features
    • Create project-local recipes in .codemap/recipes/ (SQL + optional Markdown actions/body)
    • Project recipes override bundled ones; catalog marks overrides with shadows and reports recipe source and body
    • Catalog/CLI now exposes enriched recipe metadata (body, source, shadows) and supports --recipes-json output
  • Documentation
    • Added docs and README examples for creating, overriding, and git-tracking recipes
  • Tests
    • Comprehensive tests for loading, merging, frontmatter parsing, and SQL safety validation

Pair bundled recipes with sibling .md (when-to-use / follow-up SQL); enable project-local recipes via .codemap/recipes/<id>.{sql,md}; auto-inherit into the codemap://recipes / recipes/{id} MCP resources shipped in PR #35.

Plan covers storage layout (file-pair vs YAML-frontmatter — file-pair wins per editor + LSP support), loader contract (eager + cached, pure transport-agnostic engine in src/application/recipes-loader.ts), CLI surface (zero new flags — same shape; --recipes-json gains source + body fields), and a 6-commit tracer-bullet sequence.

6 open questions worth a grill round before code: bundled storage layout, loading time (eager vs lazy), monorepo discovery walk-up, actions for project recipes (skip / frontmatter / sibling .json), conflict resolution noise level, and load-time DML/DDL rejection. Status: design pass; not yet implemented.
@coderabbitai

coderabbitai Bot commented May 1, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5c527625-ea1f-4b83-9d7a-eb0d48ad62c5

📥 Commits

Reviewing files that changed from the base of the PR and between 126a518 and e71dddc.

📒 Files selected for processing (11)
  • .agents/rules/codemap.md
  • .agents/skills/codemap/SKILL.md
  • README.md
  • docs/glossary.md
  • src/application/recipes-loader.test.ts
  • src/application/recipes-loader.ts
  • templates/agents/rules/codemap.md
  • templates/agents/skills/codemap/SKILL.md
  • templates/recipes/deprecated-symbols.md
  • templates/recipes/fan-in.md
  • templates/recipes/fan-out.md
✅ Files skipped from review due to trivial changes (7)
  • templates/agents/rules/codemap.md
  • .agents/rules/codemap.md
  • docs/glossary.md
  • templates/recipes/fan-in.md
  • README.md
  • src/application/recipes-loader.test.ts
  • templates/agents/skills/codemap/SKILL.md
🚧 Files skipped from review as they are similar to previous changes (4)
  • .agents/skills/codemap/SKILL.md
  • templates/recipes/deprecated-symbols.md
  • templates/recipes/fan-out.md
  • src/application/recipes-loader.ts

📝 Walkthrough

Adds a transport-agnostic recipes loader and runtime shim that loads bundled templates/recipes/*.sql (+ optional *.md frontmatter), auto-discovers project-local <projectRoot>/.codemap/recipes/*.sql (optional *.md), validates SQL at load time (deny-list for DML/DDL), and surfaces enriched catalog entries with body, source, and shadows metadata.

Changes

Recipes-as-Content Registry & Project-Local Overrides

| Layer | File(s) | Summary |
| --- | --- | --- |
| Data Shape | src/application/recipes-loader.ts | Adds RecipeAction, LoadedRecipe (includes source and shadows), and LoadRecipesOpts interfaces. |
| Core Recipe Loader | src/application/recipes-loader.ts | Implements loadAllRecipes, readRecipesFromDir, mergeRecipes, validateRecipeSql, extractFrontmatterAndBody. Handles .sql discovery, optional .md YAML-frontmatter actions parsing (restricted), empty-SQL rejection, DML/DDL deny-list, deterministic sorting, and project-over-bundled shadowing. |
| Loader Tests | src/application/recipes-loader.test.ts | Comprehensive tests for read/merge/load/validation/frontmatter behaviors and integration of .md actions/description. |
| CLI Runtime Wiring | src/cli/query-recipes.ts | Replaces in-file QUERY_RECIPES with a lazily-loaded registry (cached by project root) backed by loadAllRecipes. Adds resolveBundledRecipesDir, resolveProjectRecipesDir, _resetRecipesCacheForTests, expands QueryRecipeCatalogEntry to include body?, source, shadows?, and adds getQueryRecipeCatalogEntry. Re-exports RecipeAction from loader. |
| CLI Runtime Tests | src/cli/query-recipes.test.ts | Tests directory resolution, runtime shim behavior (bundled vs project overrides), catalog source/body/shadows fields, and single-entry lookup. |
| MCP Transport Wiring | src/application/mcp-server.ts | Switches MCP resource handlers to use listQueryRecipeCatalog() and getQueryRecipeCatalogEntry(id); throws for unknown recipe IDs and surfaces source/shadows/body. |
| Bundled Recipes Content | templates/recipes/*.{sql,md} | Adds/updates multiple SQL recipe templates and optional .md frontmatter documents (e.g., barrel-files, components-by-hooks, deprecated-symbols, fan-in, fan-out, files-hashes, files-largest, index-summary, markers-by-kind, visibility-tags, etc.) where actions live in .md frontmatter. |
| Docs & Templates | README.md, docs/*, .agents/*, templates/agents/* | Documents project-local recipes, override semantics (shadows: true), frontmatter actions shape, load-time SQL validation, and git-tracking rules (.codemap/recipes/ tracked; .codemap.db gitignored). Removes backlog roadmap item. |

Sequence Diagram

sequenceDiagram
    participant CLI as CLI: codemap query
    participant QR as QueryRuntime (src/cli/query-recipes.ts)
    participant RL as RecipesLoader (src/application/recipes-loader.ts)
    participant FS as Filesystem
    participant DB as Query Engine

    CLI->>QR: listQueryRecipeCatalog() / getQueryRecipeSql(id)
    QR->>QR: check cache for projectRoot
    alt cache miss
        QR->>RL: loadAllRecipes({bundledDir, projectDir?})
        RL->>FS: read bundledDir (*.sql / *.md)
        RL->>FS: read projectDir (*.sql / *.md) [if exists]
        RL->>RL: validateRecipeSql (strip comments, deny DML/DDL)
        RL->>RL: extractFrontmatterAndBody (parse actions)
        RL->>RL: mergeRecipes (project shadows bundled)
        RL-->>QR: LoadedRecipe[] (with source/shadows/body)
        QR->>QR: cache registry by projectRoot
    end
    QR->>CLI: catalog entries or SQL
    CLI->>DB: execute recipe SQL (PRAGMA query_only=1 enforced)
    DB-->>CLI: results

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

enhancement, documentation

Poem

🐰
Recipes scattered like seeds in a bed,
SQL and Markdown now both share a shed.
Project recipes whisper, "I shadow the rest,"
Safe queries first — then the agents do the rest. 🌱

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 65.38% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly and accurately describes the main change: implementing a recipes-as-content registry with bundled Markdown files and project-local recipes support. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@changeset-bot

changeset-bot Bot commented May 1, 2026

🦋 Changeset detected

Latest commit: e71dddc

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
| Name | Type |
| --- | --- |
| @stainless-code/codemap | Minor |

templates/recipes/<id>.{sql,md} for both bundled and project recipes. One loader code path, SQLite syntax highlighting in every editor, single-file diffs, sqlite3 .read works for ad-hoc testing. Migration is ~15 files; shim layer in cli/query-recipes.ts preserves backwards-compat exports.
~15-20 small file reads is sub-millisecond on warm SSD; rounding error vs node/bun startup. 'Registry is always populated' kills lazy guards across three call sites; surfaces malformed-recipe errors early. Rejected disk cache — over-engineered for static SQL strings.
…tion + settle Q-C (root-only)

Folds two grill-round insights into the plan:

1. § 3.2 gains a 'Gitignore note (verified, not assumed)' paragraph — git check-ignore confirmed .codemap/recipes/ is NOT matched by the existing .codemap.* literal-dot pattern. Project recipes are checked into git by default, intended behavior. Consumer-side risk (their own .gitignore using .codemap*) is documented; agent rule + skill will recommend !.codemap/recipes/ un-ignore.

2. New § 3.3 'Why filesystem and not .codemap.db' captures the side-by-side test against query_baselines (which IS in DB, opposite call): nature (output vs input), index-state coupling, human-authored-for-review, meaningful-outside-one-DB. Records the 'send a recipe to a colleague' deciding test (file: send the .sql; DB: reinvent files via export/import). Bundled-recipes-in-npm-package angle reinforces.

3. Q-C settled: root-only (<projectRoot>/.codemap/recipes/). Walk-up would make recipes the only codemap piece resolving differently from .codemap.db / indexer / resolver. Forward-compatible: root-only-→-walk-up is non-breaking; the reverse would be.
…actions

Hand-rolled parser (~30 LOC) handles only the shallow shape codemap needs (key/list/string/bool). Frontmatter co-locates the action with its prose. Project recipes feel first-class with the same actions template surface bundled recipes have. Rejected gray-matter / js-yaml (~50KB for full YAML 1.2 spec we don't need) and sibling .actions.json (wrong factoring — separates action from explanation).
…ll prompt update

Three-layer answer optimised for agent DX + traceability:

1. Silent at runtime (matches user-code-wins convention).
2. shadows: true flag in catalog responses (--recipes-json, codemap://recipes, codemap://recipes/{id}) — discovery-time provenance.
3. Bundled skill prompt instructs agents to read codemap://recipes at session start + check shadows.

Per-execution response shape stays unchanged (preserves plan § 4 uniformity). Stderr warnings rejected (MCP-stderr logs don't surface to the model anyway). --allow-shadow flag rejected (hostile to legitimate override case). Loader cost: ~5 LOC for the shadow check.
…ckstop

Defence in depth: lexical scan rejects DML/DDL at load with recipe-aware error UX (fires in CI / pre-commit). PRAGMA query_only runtime backstop from PR #35 stays as the parser-proof safety net for what lexical can't catch (WITH clauses, multi-statement, attached DBs).

All 6 grill questions now settled — ready for tracer 1.
Pure transport-agnostic loader in src/application/recipes-loader.ts (mirrors the cmd-* ↔ *-engine seam from PR #33). Scope per plan §8 Tracer 1:

- LoadedRecipe interface (canonical shape; bundled + project share it)
- RecipeAction interface lifted from cli/query-recipes.ts (will become the canonical home; query-recipes becomes a shim in Tracer 2)
- readRecipesFromDir(dir, source) — reads <id>.sql, pairs with optional <id>.md (description = first non-empty line, body = full text). Returns [] for missing/non-directory paths (project-recipes case where .codemap/recipes/ is absent — not an error). Throws on empty SQL with recipe-aware message
- mergeRecipes(bundled, project) — project wins on id collision; sets shadows: true on overriding entries (Q-E settled). Output sorted by id (deterministic catalog order)
- loadAllRecipes({bundledDir, projectDir}) — Tracer 1 wires bundled only; projectDir argument accepted but stubbed (returns []). Tracer 3 plugs project loader
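
The merge rule above (project wins on id collision, overriding entries get shadows: true, output sorted by id) can be sketched as follows. This is an illustration under assumptions, not the loader's source: the `LoadedRecipe` fields beyond `source` and `shadows` are guessed, and the recipe contents are made up.

```typescript
// Hypothetical sketch of the merge rule; field names other than
// "source" and "shadows" are assumptions.
interface LoadedRecipe {
  id: string;
  sql: string;
  source: "bundled" | "project";
  shadows?: boolean;
}

function mergeRecipes(bundled: LoadedRecipe[], project: LoadedRecipe[]): LoadedRecipe[] {
  const bundledIds = new Set(bundled.map((recipe) => recipe.id));
  const byId = new Map<string, LoadedRecipe>();
  for (const recipe of bundled) byId.set(recipe.id, recipe);
  for (const recipe of project) {
    // Project wins on id collision; flag the override so catalogs can show it.
    byId.set(recipe.id, bundledIds.has(recipe.id) ? { ...recipe, shadows: true } : recipe);
  }
  // Sort by id so catalog order is deterministic.
  return [...byId.values()].sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
}

const merged = mergeRecipes(
  [
    { id: "fan-out", sql: "SELECT 1", source: "bundled" },
    { id: "fan-in", sql: "SELECT 2", source: "bundled" },
  ],
  [
    { id: "fan-out", sql: "SELECT 3", source: "project" },
    { id: "internal-flaky-tests", sql: "SELECT 4", source: "project" },
  ],
);
console.log(merged.map((r) => `${r.id}:${r.source}:${r.shadows === true}`));
// [ "fan-in:bundled:false", "fan-out:project:true", "internal-flaky-tests:project:false" ]
```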

15 unit tests cover: missing dir, non-.sql ignore, sql-only loading, sibling-md pairing, heading-strip in description, deterministic id order, empty-sql rejection, comments-then-sql happy path, non-directory passthrough, all 4 merge cases (project-only / bundled-only / shadow / no-overlap), Tracer 1 stub behavior.

Layer note: query-recipes.ts (cli/) still owns QUERY_RECIPES + getQueryRecipeSql / getQueryRecipeActions / listQueryRecipeCatalog / listQueryRecipeIds. Tracer 2 migrates them to call into this loader.
…,md} (Tracer 2 of 6)

QUERY_RECIPES TypeScript object map → templates/recipes/<id>.sql + sibling .md description files. cli/query-recipes.ts becomes a thin shim that calls loadAllRecipes() at first access and caches the result.

12 bundled recipes migrated: fan-out, fan-out-sample, fan-out-sample-json, fan-in, index-summary, files-largest, components-by-hooks, markers-by-kind, deprecated-symbols, visibility-tags, files-hashes, barrel-files. Each gets a .sql file (verbatim) + .md (description body — first non-empty line becomes the catalog 'description').
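
A small sketch of the "first non-empty line becomes the catalog description" rule. The function name `deriveDescription` is hypothetical, and stripping leading `#` heading markers is an assumption based on the "heading-strip in description" test mentioned in Tracer 1; frontmatter handling (added in Tracer 5) is deliberately left out.

```typescript
// Hypothetical helper: derive a one-line description from a recipe's .md body.
function deriveDescription(markdown: string): string | undefined {
  for (const line of markdown.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // skip leading blank lines
    // Assumption: leading ATX heading markers ("# ", "## ") are dropped.
    return trimmed.replace(/^#+\s*/, "");
  }
  return undefined; // empty or whitespace-only file: no description
}

console.log(deriveDescription("\n# Fan-out\n\nLong-form notes follow here."));
// "Fan-out"
```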

Backwards-compat preserved:
- QUERY_RECIPES exported as a Proxy so callers (cmd-query.ts, mcp-server.ts) can still use the legacy object-shape access (QUERY_RECIPES['fan-out'].description, Object.keys(QUERY_RECIPES), etc.) without changes
- getQueryRecipeSql / getQueryRecipeActions / listQueryRecipeIds / listQueryRecipeCatalog all derive from the registry — same return shapes
- Smoke tested: bun src/index.ts query --recipes-json + query --print-sql fan-out + query-golden all green
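
To show how a Proxy can keep legacy object-shape access working over a lazily built registry, here is a self-contained sketch. It is not the shim's implementation: `getRegistry` is a stand-in with hard-coded entries, and the descriptions and SQL are invented for the example.

```typescript
interface LegacyRecipe {
  description: string;
  sql: string;
}

// Hypothetical stand-in for the loader-backed registry, built on first access.
let registry: Map<string, LegacyRecipe> | undefined;
let loadCount = 0;
function getRegistry(): Map<string, LegacyRecipe> {
  if (registry === undefined) {
    loadCount += 1;
    registry = new Map<string, LegacyRecipe>([
      ["fan-in", { description: "Most-imported files", sql: "SELECT 1" }],
      ["fan-out", { description: "Files with the most imports", sql: "SELECT 2" }],
    ]);
  }
  return registry;
}

const QUERY_RECIPES = new Proxy({} as Record<string, LegacyRecipe>, {
  get: (_target, key) => (typeof key === "string" ? getRegistry().get(key) : undefined),
  has: (_target, key) => typeof key === "string" && getRegistry().has(key),
  ownKeys: () => [...getRegistry().keys()],
  // Object.keys only reports keys whose descriptor is enumerable, and the
  // descriptor must be configurable because the target object is empty.
  getOwnPropertyDescriptor: (_target, key) => {
    const value = typeof key === "string" ? getRegistry().get(key) : undefined;
    return value === undefined
      ? undefined
      : { value, enumerable: true, configurable: true, writable: false };
  },
});

console.log(Object.keys(QUERY_RECIPES)); // [ "fan-in", "fan-out" ]
console.log(QUERY_RECIPES["fan-out"]?.description); // "Files with the most imports"
```

Nothing is loaded until the first property access, and callers that only read keys or properties never see the difference from a plain object.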

Bundled recipe actions stay in code (BUNDLED_RECIPE_ACTIONS map) through Tracer 5 — that tracer adds the YAML frontmatter parser and lifts these into the .md files alongside descriptions, completing the migration.

New: resolveBundledRecipesDir() in cli/query-recipes.ts mirrors resolveAgentsTemplateDir()'s npm-package layout (templates/recipes/ next to templates/agents/). _resetRecipesCacheForTests() escape hatch added for fixture swaps.

templates/ already shipped in the npm artifact (per package.json files); templates/recipes/ inherits.

Tracer 1's loader now has a real consumer; Tracer 3 will plug in projectDir for .codemap/recipes/<id>.sql discovery.
…acer 3 of 6)

Wires up the actually-new user-facing capability per plan §1: teams ship internal SQL recipes via git-tracked .codemap/recipes/<id>.sql files.

Three pieces:

1. loadAllRecipes now reads opts.projectDir (was stubbed in Tracer 1). Composes via mergeRecipes — project wins on id collision with shadows: true flag (per Q-E settled).

2. resolveProjectRecipesDir(projectRoot) — root-only resolution per Q-C (no walk-up). Returns undefined if .codemap/recipes/ is missing or is a file rather than a directory; absence is not an error.

3. cli/query-recipes.ts shim's getRegistry() now resolves projectDir via getProjectRoot() (falls back to bundled-only if initCodemap hasn't run — covers direct unit-test paths). Cache key includes projectDir so multi-root sessions (test fixtures) re-resolve cleanly. _resetRecipesCacheForTests clears both halves.
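
The root-only resolution in item 2 can be sketched with Node's fs module. This is an assumption-laden illustration, not the PR's code: only the function name and its "undefined when absent or not a directory" contract come from the commit message; the body and the temp-dir demo are made up.

```typescript
import { mkdirSync, mkdtempSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Root-only lookup: no walk-up to parent directories.
function resolveProjectRecipesDir(projectRoot: string): string | undefined {
  const candidate = join(projectRoot, ".codemap", "recipes");
  try {
    return statSync(candidate).isDirectory() ? candidate : undefined;
  } catch {
    return undefined; // missing path: absence is not an error
  }
}

// Demo fixtures in throwaway temp directories.
const rootWithDir = mkdtempSync(join(tmpdir(), "codemap-dir-"));
mkdirSync(join(rootWithDir, ".codemap", "recipes"), { recursive: true });

const rootWithFile = mkdtempSync(join(tmpdir(), "codemap-file-"));
mkdirSync(join(rootWithFile, ".codemap"));
writeFileSync(join(rootWithFile, ".codemap", "recipes"), ""); // a file, not a directory

const rootEmpty = mkdtempSync(join(tmpdir(), "codemap-empty-"));

const resolvedDir = resolveProjectRecipesDir(rootWithDir);
const resolvedFile = resolveProjectRecipesDir(rootWithFile);
const resolvedEmpty = resolveProjectRecipesDir(rootEmpty);
console.log(resolvedDir !== undefined, resolvedFile, resolvedEmpty);
```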

5 new loader-engine tests: bundled-only / bundled+project / shadow detection / sorted ordering / missing-dir. 7 new shim tests: 3 for resolveProjectRecipesDir (absent / present / file-not-dir) + 4 for the end-to-end shim path (bundled-only baseline / project-local id surfaces / project shadows bundled / catalog merging).

Project recipes get actions: undefined through Tracer 5 — that tracer adds the YAML frontmatter parser.
…r 4 of 6)

Extends QueryRecipeCatalogEntry with three new additive fields:

- body — full Markdown body of sibling <id>.md (description = first non-empty line; body = long-form 'when to use' / 'follow-up SQL' content)
- source — 'bundled' | 'project' (provenance discriminator)
- shadows — true ONLY on project entries that override a bundled recipe of the same id (per Q-E settled — agents check this at session start to know when a recipe behaves differently from the documented bundled version)

All additive: existing callers that destructure {id, description, sql, actions?} keep working unchanged.
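
As an illustration of how a consumer might use the new fields, the sketch below types a catalog entry per the payload shape {id, description, body?, sql, actions?, source, shadows?} and lists the project recipes that override bundled ones. The helper `listShadowingRecipes` and the sample JSON are hypothetical, and treating the catalog as a JSON array is an assumption.

```typescript
interface RecipeAction {
  type: string;
  auto_fixable?: boolean;
  description?: string;
}

interface QueryRecipeCatalogEntry {
  id: string;
  description: string;
  sql: string;
  actions?: RecipeAction[];
  body?: string;
  source: "bundled" | "project";
  shadows?: boolean;
}

// Hypothetical helper: ids of project recipes that override a bundled recipe.
function listShadowingRecipes(catalog: QueryRecipeCatalogEntry[]): string[] {
  return catalog
    .filter((entry) => entry.source === "project" && entry.shadows === true)
    .map((entry) => entry.id);
}

// Made-up catalog payload, assumed here to be a JSON array of entries.
const catalogJson = JSON.stringify([
  { id: "fan-in", description: "Most-imported files", sql: "SELECT 1", source: "bundled" },
  { id: "fan-out", description: "Team variant", sql: "SELECT 2", source: "project", shadows: true },
  { id: "internal-flaky-tests", description: "Flaky tests", sql: "SELECT 3", source: "project" },
]);
const shadowing = listShadowingRecipes(JSON.parse(catalogJson) as QueryRecipeCatalogEntry[]);
console.log(shadowing); // [ "fan-out" ]
```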

New helper: getQueryRecipeCatalogEntry(id) — same shape as listQueryRecipeCatalog entries, for one id (undefined for unknown). Used by codemap://recipes/{id} MCP resource so the per-id payload includes the same provenance fields the full catalog has.

MCP server changes:
- codemap://recipes/{id} payload now includes body / source / shadows (replaced the inline {id, description, sql, actions?} construction with JSON.stringify(getQueryRecipeCatalogEntry(id)))
- codemap://recipes list-callback uses listQueryRecipeCatalog() (drops dependency on the legacy QUERY_RECIPES Proxy access)
- Resource description updated to 'Single recipe by id: {id, description, body?, sql, actions?, source, shadows?}'
- Removed unused listQueryRecipeIds + QUERY_RECIPES imports
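
A transport-free sketch of the per-id resource behaviour described above: serialise the catalog entry with JSON.stringify and throw on unknown ids. It deliberately uses no MCP SDK calls; `readRecipeResource`, the in-memory catalog, and the error wording are hypothetical.

```typescript
interface CatalogEntry {
  id: string;
  description: string;
  sql: string;
  body?: string;
  source: "bundled" | "project";
  shadows?: boolean;
}

// Made-up in-memory catalog standing in for the loader-backed registry.
const demoCatalog: CatalogEntry[] = [
  { id: "fan-out", description: "Team variant", sql: "SELECT 1", source: "project", shadows: true },
];

function getQueryRecipeCatalogEntry(id: string): CatalogEntry | undefined {
  return demoCatalog.find((entry) => entry.id === id);
}

// Hypothetical handler body for codemap://recipes/{id}.
function readRecipeResource(uri: string): string {
  const prefix = "codemap://recipes/";
  if (!uri.startsWith(prefix)) throw new Error(`Unsupported URI: ${uri}`);
  const id = decodeURIComponent(uri.slice(prefix.length));
  const entry = getQueryRecipeCatalogEntry(id);
  if (entry === undefined) throw new Error(`Unknown recipe: ${id}`);
  // Same fields as the full catalog, so provenance survives per-id reads.
  return JSON.stringify(entry);
}

const payload = JSON.parse(readRecipeResource("codemap://recipes/fan-out")) as CatalogEntry;
console.log(payload.source, payload.shadows); // project true
```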

5 new shim tests: bundled.source, bundled.body presence, project.source, project.shadows=true on override, getQueryRecipeCatalogEntry parity + unknown-id-undefined.

Tracer 5 next: YAML frontmatter parser for project-recipe actions + load-time DML/DDL lexical check.
… (Tracer 5 of 6)

Closes the Q-D + Q-F open questions from the grill round.

Q-D — actions for project-local recipes:
- Hand-rolled YAML frontmatter parser in extractFrontmatterAndBody (~30 LOC core, ~50 LOC including helpers). Strict shape: one optional 'actions' list of {type, auto_fixable?, description?} between --- delimiters at the top of <id>.md. Other top-level keys tolerated (forward-compat for future recipe metadata). Unknown action keys silently ignored. Items missing 'type' are filtered out (defensive).
- Lifted the 6 bundled recipes' actions (fan-out, fan-in, files-largest, deprecated-symbols, visibility-tags, barrel-files) from BUNDLED_RECIPE_ACTIONS in cli/query-recipes.ts into YAML frontmatter on each templates/recipes/<id>.md. The map is gone — uniform shape for both bundled and project recipes (Q-A's promised 'one storage shape, one loader code path').
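
For readers curious what a hand-rolled parser for this shallow shape might look like, here is a minimal sketch. It is not the PR's extractFrontmatterAndBody: only the function name and the accepted shape (an optional block-list `actions` of {type, auto_fixable?, description?} between --- delimiters) come from the text above; the parsing strategy, the return type, and the treatment of unterminated frontmatter are assumptions.

```typescript
interface RecipeAction {
  type: string;
  auto_fixable?: boolean;
  description?: string;
}

function unquote(raw: string): string {
  const v = raw.trim();
  const quoted = v.length >= 2 && (v[0] === '"' || v[0] === "'") && v[v.length - 1] === v[0];
  return quoted ? v.slice(1, -1) : v;
}

function extractFrontmatterAndBody(markdown: string): { actions?: RecipeAction[]; body: string } {
  const lines = markdown.split(/\r?\n/);
  if (lines[0]?.trim() !== "---") return { body: markdown };
  const end = lines.findIndex((line, index) => index > 0 && line.trim() === "---");
  if (end === -1) return { body: markdown }; // assumption: unterminated block is plain body

  const items: Partial<RecipeAction>[] = [];
  let inActions = false;
  let sawActions = false;
  for (const line of lines.slice(1, end)) {
    if (line.trim() === "") continue;
    if (/^\S/.test(line)) {
      // Top-level key: only "actions:" is interpreted, other keys are tolerated.
      inActions = /^actions:\s*$/.test(line);
      sawActions = sawActions || inActions;
      continue;
    }
    if (!inActions) continue;
    const m = /^\s+(-\s+)?([A-Za-z_]+):\s*(.*)$/.exec(line);
    if (m === null) continue;
    if (m[1] !== undefined || items.length === 0) items.push({}); // "- " starts a new item
    const item = items[items.length - 1];
    if (item === undefined) continue;
    const key = m[2] ?? "";
    const value = unquote(m[3] ?? "");
    if (key === "type") item.type = value;
    else if (key === "description") item.description = value;
    else if (key === "auto_fixable") item.auto_fixable = value === "true";
    // Unknown action keys are ignored.
  }
  // Items missing "type" are filtered out.
  const actions = items.filter((a): a is RecipeAction => typeof a.type === "string" && a.type !== "");
  const body = lines.slice(end + 1).join("\n");
  return sawActions ? { actions, body } : { body };
}

const exampleMd = [
  "---",
  "actions:",
  "  - type: review-coupling",
  "    auto_fixable: false",
  '    description: "Inspect high fan-out files for extraction opportunities."',
  "---",
  "Top fan-out files to review first.",
].join("\n");
const parsedExample = extractFrontmatterAndBody(exampleMd);
console.log(parsedExample.actions, parsedExample.body);
```

The sketch only understands the block-list form; inline flow syntax such as `actions: [{...}]` would be ignored rather than parsed.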

Q-F — load-time DML/DDL lexical check:
- validateRecipeSql exported from recipes-loader. Strips -- comments, finds first identifier-shaped token, rejects if in deny-list (INSERT/UPDATE/DELETE/DROP/CREATE/ALTER/ATTACH/DETACH/REPLACE/TRUNCATE/VACUUM/PRAGMA). Recipe-aware error message points at --save-baseline as the legitimate path for capturing rows.
- Runtime PRAGMA query_only=1 backstop from PR #35 stays unchanged — different jobs: lexical = good UX for common mistakes; backstop = correctness for what slips by.
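
The lexical check can be sketched as below, assuming the steps in the bullet above: strip `--` comments, take the first identifier-shaped token, reject it if it is on the deny-list. The keyword list is taken from the commit message; the `(id, sql)` signature, the error wording, and throwing on empty SQL inside this function are assumptions.

```typescript
const DENIED_KEYWORDS = new Set([
  "INSERT", "UPDATE", "DELETE", "DROP", "CREATE", "ALTER",
  "ATTACH", "DETACH", "REPLACE", "TRUNCATE", "VACUUM", "PRAGMA",
]);

// Naive: removes everything after "--" on each line. Block comments are
// not handled in this sketch.
function stripLineComments(sql: string): string {
  return sql
    .split(/\r?\n/)
    .map((line) => {
      const at = line.indexOf("--");
      return at === -1 ? line : line.slice(0, at);
    })
    .join("\n");
}

// Hypothetical signature and messages; only the overall behaviour is from the PR text.
function validateRecipeSql(id: string, sql: string): void {
  const firstToken = /[A-Za-z_][A-Za-z0-9_]*/.exec(stripLineComments(sql))?.[0];
  if (firstToken === undefined) {
    throw new Error(`Recipe "${id}" contains no SQL statement.`);
  }
  const keyword = firstToken.toUpperCase(); // case-insensitive match
  if (DENIED_KEYWORDS.has(keyword)) {
    throw new Error(`Recipe "${id}" starts with ${keyword}; recipes must be read-only.`);
  }
}

validateRecipeSql("ok", "-- top files\nSELECT path FROM files LIMIT 10"); // passes
// Passes the lexical check even though it writes: first token is WITH.
// This is the kind of case the runtime PRAGMA query_only=1 backstop covers.
validateRecipeSql("sneaky", "WITH t AS (SELECT 1) DELETE FROM files");
```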

Lessons re-learned (already in .agents/lessons.md): backticks containing colons in line/block comments break Bun's parser; /* */ inside backticks closes the surrounding /** */ JSDoc. Avoided both by replacing problematic backticks with plain quotes / parentheses.

Tests: 27 new — 13 for validateRecipeSql, 7 for extractFrontmatterAndBody, 1 integration confirming actions + description both populate from a single .md. Total now 54 pass on the loader + shim test files.
…r 6 of 6)

Lifts canonical bits out of docs/plans/recipes-content-registry.md per docs/README.md Rule 2 (delete plans on ship). Surfaces touched:

- architecture.md § CLI usage gains a 'Recipes wiring' paragraph documenting the recipes-loader.ts ↔ query-recipes.ts seam, file-pair storage layout (templates/recipes/ for bundled, .codemap/recipes/ for project), shadow flag + load-time DML/DDL validation, and the .codemap.db-vs-.codemap/recipes/ gitignore asymmetry.

- glossary.md § R: 'recipe' definition expanded to disambiguate bundled vs project sources, surface the actions-via-frontmatter shape, validation, and runtime backstop. New entry 'recipe shadows' covering the override discovery pattern.

- roadmap.md: removed the recipes-as-content-registry backlog entry (now shipped).

- README.md CLI block: added a project-recipes example showing mkdir + echo + --recipe lookup; mentions the shadows discovery field.

- .agents/rules/codemap.md + templates/agents/rules/codemap.md (mirrored per Rule 10): new 'Project-local recipes' bullet right after Recipe actions covers the .codemap/recipes/ location, shadows: true catalog flag, YAML frontmatter shape, and load-time DML/DDL rejection.

- .agents/skills/codemap/SKILL.md + templates/agents/skills/codemap/SKILL.md (mirrored): codemap://recipes resource description gains the 'check shadows at session start' guidance + source/shadows fields; codemap://recipes/{id} payload shape extended to {id, description, body?, sql, actions?, source, shadows?}; new 'Project-local recipes' bullet in the recipe section gives agents the full reference.

- docs/plans/recipes-content-registry.md DELETED (Rule 2 — plan content fully lifted into architecture.md / glossary.md / agent files).

- Minor changeset added (additive features, no schema breaks).
@SutuSebastian SutuSebastian marked this pull request as ready for review May 2, 2026 07:18
@SutuSebastian

Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented May 2, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/application/mcp-server.ts (1)

202-204: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update MCP descriptions to include project-local recipes.

Both descriptions still say “bundled” even though this PR now serves bundled + project entries. This can mislead host UIs and tool introspection output.

Suggested wording patch
- "Run a bundled SQL recipe by id. Output rows carry per-row `actions` hints ...
+ "Run a cataloged SQL recipe by id (bundled or project-local). Output rows carry per-row `actions` hints ...

- "Bundled SQL recipes catalog (id, description, sql, optional per-row actions). Same payload as `codemap query --recipes-json`."
+ "SQL recipes catalog (bundled + project-local) with id, description, sql, optional actions, and provenance fields. Same payload as `codemap query --recipes-json`."

Also applies to: 587-589

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/application/mcp-server.ts` around lines 202 - 204, Update the user-facing
description strings that currently say “bundled” to indicate the server serves
both bundled and project-local recipes; specifically edit the description
property for the recipe-running RPC (the string mentioning "Run a bundled SQL
recipe by id" and the reference to "codemap://recipes") and the other similar
description later in the file so they read e.g. "Run a bundled and project-local
SQL recipe by id" (or "bundled + project-local") and include the same
clarification where the other description appears (the second description block
that mirrors the first).
🧹 Nitpick comments (1)
.agents/rules/codemap.md (1)

32-32: ⚡ Quick win

Add a concrete YAML frontmatter example for actions.
This line describes the shape, but a copy-pasteable example would make project recipe config unambiguous.

Proposed doc patch
 **Project-local recipes:** drop `<id>.sql` (and optional `<id>.md` for description + actions) into **`<projectRoot>/.codemap/recipes/`** — auto-discovered, runs via `--recipe <id>` like bundled. Project recipes win on id collision; check `--recipes-json` for **`shadows: true`** entries to know when a project recipe overrides the documented bundled version. `<id>.md` supports YAML frontmatter (`actions: [{type, auto_fixable?, description?}]`) for the per-row action template — same shape as bundled recipes. Validation: SQL is rejected at load time if it starts with DML/DDL (DELETE/DROP/UPDATE/etc.); the runtime `PRAGMA query_only=1` is the parser-proof backstop.
+
+Example `<projectRoot>/.codemap/recipes/fan-out.md`:
+
+```md
+---
+actions:
+  - type: review-coupling
+    description: "Inspect high fan-out files for extraction opportunities."
+---
+Top fan-out files to review first.
+```

As per coding guidelines: **/*.{json,yaml,yml,md}: Document all configuration options with examples.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/rules/codemap.md at line 32, Add a concrete YAML frontmatter example
for the per-recipe `actions` schema referenced in the `.codemap/recipes/` docs:
update the paragraph that currently shows the shape `actions: [{type,
auto_fixable?, description?}]` for `<id>.md` to include a copy-pasteable example
showing at least one action entry (fields: `type`, optional `auto_fixable`, and
`description`) so readers can see exact YAML syntax and placement; reference the
`<id>.md` frontmatter usage and ensure the example aligns with the documented
action types (e.g., `review-coupling`) and clarifies that the example sits
between `---` markers in the recipe markdown.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/glossary.md`:
- Around line 328-334: Update the earlier "Recipe" convention that still
describes recipes as only bundled SQL in src/cli/query-recipes.ts to match the
new global definition in docs/glossary.md and src/application/recipes-loader.ts:
state that recipes are a .sql file with an optional sibling .md (bundled under
templates/recipes/<id>.{sql,md} or project-local under
<projectRoot>/.codemap/recipes/<id>.{sql,md}), note project-local overrides
bundled (shadows: true), that actions live in YAML frontmatter of the .md, and
keep the run CLI hint (codemap query --recipe / -r) and validation/runtime
protections (load-time checks + PRAGMA query_only=1).

In `@README.md`:
- Around line 113-119: The example recipe uses the wrong language value — update
the SQL in the example and the recipe file (.codemap/recipes/big-ts-files.sql
and the README snippet) to use Codemap's indexed language code for TypeScript
(e.g. language = 'ts') instead of 'typescript' so the query will actually match
TypeScript files; ensure the README command example remains the same (codemap
query --recipe big-ts-files) but with the corrected SQL content.

In `@src/application/recipes-loader.ts`:
- Around line 199-215: The current empty-SQL check only strips line comments via
stripLineComments and lets recipes consisting solely of /* ... */ block comments
pass or wrongly trigger deny-list matches; update the logic so block comments
are removed prior to validation: either extend stripLineComments (or create a
new stripComments function) to remove all /* ... */ block comments (multi-line)
as well as -- line comments, then have isEffectivelyEmpty call that function and
ensure validateRecipeSql runs its deny-list checks against the comment-stripped
SQL; reference stripLineComments, isEffectivelyEmpty and validateRecipeSql when
making the change.

In `@templates/agents/rules/codemap.md`:
- Line 39: The documentation for project recipe `<id>.md` frontmatter currently
shows an inline/flow YAML example that could be copied incorrectly; update the
`<id>.md` frontmatter example to show the block-list YAML form that the loader
expects (top-level actions: followed by newline-prefixed - type: entries) and
explicitly state that only the top-level actions: key with list items like -
type: ... (and optional auto_fixable?, description?) is parsed; reference the
loader's expected keys by name (`actions:`, `- type:`, `auto_fixable?`,
`description?`) and add a short note warning that inline flow syntax will be
ignored so authors should use the block list format.

In `@templates/recipes/deprecated-symbols.md`:
- Around line 7-9: The template references a non-existent column `name` in the
example query; update the guidance in deprecated-symbols.md to use the actual
column `callee_name` when instructing agents to find call sites (replace the
suggested `WHERE name = '<symbol>'` with `WHERE callee_name = '<symbol>'` and
ensure the text mentions the `calls` table and `callee_name` explicitly so
agents run valid queries).

In `@templates/recipes/fan-in.md`:
- Line 9: Replace the phrase "most-imported" with "most depended-on" in the
templates/recipes/fan-in.md content (and any other occurrences in that file) so
the description aligns with the dependency-edge data model; update surrounding
sentence to read something like "Files at the top are the most depended-on in
the codebase — changes here ripple through many consumers." and ensure
terminology consistency across the file.

In `@templates/recipes/fan-out.md`:
- Line 9: The sentence uses "import from many other files" which narrows the
metric; update the wording by replacing that phrase with "depend on many other
files" so the recipe describes dependency fan-out accurately, and scan the
surrounding text in the same document for similar occurrences to standardize
terminology to "depend" instead of "import" where appropriate.

---

Outside diff comments:
In `@src/application/mcp-server.ts`:
- Around line 202-204: Update the user-facing description strings that currently
say “bundled” to indicate the server serves both bundled and project-local
recipes; specifically edit the description property for the recipe-running RPC
(the string mentioning "Run a bundled SQL recipe by id" and the reference to
"codemap://recipes") and the other similar description later in the file so they
read e.g. "Run a bundled or project-local SQL recipe by id" (or "bundled +
project-local") and include the same clarification where the other description
appears (the second description block that mirrors the first).

---

Nitpick comments:
In @.agents/rules/codemap.md:
- Line 32: Add a concrete YAML frontmatter example for the per-recipe `actions`
schema referenced in the `.codemap/recipes/` docs: update the paragraph that
currently shows the shape `actions: [{type, auto_fixable?, description?}]` for
`<id>.md` to include a copy-pasteable example showing at least one action entry
(fields: `type`, optional `auto_fixable`, and `description`) so readers can see
exact YAML syntax and placement; reference the `<id>.md` frontmatter usage and
ensure the example aligns with the documented action types (e.g.,
`review-coupling`) and clarifies that the example sits between `---` markers in
the recipe markdown.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e33620f9-0a01-48e8-87da-1268c804ead2

📥 Commits

Reviewing files that changed from the base of the PR and between bb59905 and 126a518.

📒 Files selected for processing (38)
  • .agents/rules/codemap.md
  • .agents/skills/codemap/SKILL.md
  • .changeset/recipes-content-registry.md
  • README.md
  • docs/architecture.md
  • docs/glossary.md
  • docs/roadmap.md
  • src/application/mcp-server.ts
  • src/application/recipes-loader.test.ts
  • src/application/recipes-loader.ts
  • src/cli/query-recipes.test.ts
  • src/cli/query-recipes.ts
  • templates/agents/rules/codemap.md
  • templates/agents/skills/codemap/SKILL.md
  • templates/recipes/barrel-files.md
  • templates/recipes/barrel-files.sql
  • templates/recipes/components-by-hooks.md
  • templates/recipes/components-by-hooks.sql
  • templates/recipes/deprecated-symbols.md
  • templates/recipes/deprecated-symbols.sql
  • templates/recipes/fan-in.md
  • templates/recipes/fan-in.sql
  • templates/recipes/fan-out-sample-json.md
  • templates/recipes/fan-out-sample-json.sql
  • templates/recipes/fan-out-sample.md
  • templates/recipes/fan-out-sample.sql
  • templates/recipes/fan-out.md
  • templates/recipes/fan-out.sql
  • templates/recipes/files-hashes.md
  • templates/recipes/files-hashes.sql
  • templates/recipes/files-largest.md
  • templates/recipes/files-largest.sql
  • templates/recipes/index-summary.md
  • templates/recipes/index-summary.sql
  • templates/recipes/markers-by-kind.md
  • templates/recipes/markers-by-kind.sql
  • templates/recipes/visibility-tags.md
  • templates/recipes/visibility-tags.sql
💤 Files with no reviewable changes (1)
  • docs/roadmap.md

All 7 verified valid against actual code; all applied.

Major:

- recipes-loader.ts: stripLineComments now strips block /* */ comments
  BEFORE the line-comment + first-keyword scan. Without this, the
  deny-list could be bypassed two ways:
  (1) A leading block comment containing a deny-listed keyword could
      cause the lexer to misclassify a legit SELECT as DDL.
  (2) /* SELECT */ DELETE FROM x would be accepted because the lexer
      saw 'SELECT' in the comment first.
  Now: regex strips block comments first, then line comments, then
  first-identifier match runs. Pure-block-comment files also trip the
  empty-recipe check correctly. The runtime PRAGMA query_only=1
  backstop is still the parser-proof safety net for things like
  string-literal-embedded comments (vanishingly rare). 3 new tests
  cover false-positive avoidance, smuggled-DELETE rejection, and
  pure-block-comment-as-empty.
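
A minimal sketch of the strip-then-scan order described above, assuming
a first-keyword deny-list. The names `stripComments`, `validateRecipeSql`
and the contents of `DENIED_FIRST_KEYWORDS` are illustrative, not the
actual recipes-loader.ts code:

```typescript
// Hypothetical sketch: deny-list contents and function names are assumptions.
const DENIED_FIRST_KEYWORDS = new Set([
  "DELETE", "DROP", "UPDATE", "INSERT", "ALTER", "CREATE", "REPLACE",
]);

/** Remove block comments first, then line comments. */
function stripComments(sql: string): string {
  return sql
    .replace(/\/\*[\s\S]*?\*\//g, " ") // block comments, possibly multi-line
    .replace(/--[^\n]*/g, " "); // line comments up to end of line
}

type Validation = { ok: true } | { ok: false; reason: string };

function validateRecipeSql(sql: string): Validation {
  const stripped = stripComments(sql).trim();
  // A file that is only comments counts as an empty recipe.
  if (stripped === "") return { ok: false, reason: "empty recipe" };
  // First identifier after comments are gone decides accept vs reject.
  const first = (/^[A-Za-z_]+/.exec(stripped)?.[0] ?? "").toUpperCase();
  if (DENIED_FIRST_KEYWORDS.has(first)) {
    return { ok: false, reason: `recipe starts with ${first}` };
  }
  return { ok: true };
}
```

This sketch does not understand string literals, so comment markers
inside a quoted string would be stripped too; that gap is what the
runtime PRAGMA query_only=1 backstop is said to cover.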

Minor:

- glossary.md § Conventions: 'Recipe = bundled SQL string in
  src/cli/query-recipes.ts' was outdated (recipes are now files in
  templates/recipes/ + .codemap/recipes/; query-recipes.ts is the
  shim). Reworded with a forward-pointer to § R recipe.

- README.md CLI block: language IN ('ts', 'tsx') instead of
  language = 'typescript'. Verified via codemap query — the indexer
  stores 'ts' / 'tsx' / 'md' / 'json' etc., not the long form. The
  example as written would have returned 0 rows.

- .agents/rules/codemap.md + templates/agents/rules/codemap.md
  + .agents/skills/codemap/SKILL.md + templates/agents/skills/codemap/SKILL.md
  (mirrored 4-way per Rule 10): the YAML frontmatter doc was showing
  inline-flow shape (actions: [{type, ...}]) but the loader's hand-
  rolled parser only accepts block-list (- type:). Authors copying
  inline form would silently lose actions. Replaced with a fenced
  block showing the correct block-list form.

- templates/recipes/deprecated-symbols.md: WHERE name → WHERE
  callee_name. Verified via pragma_table_info — the calls table has
  caller_name + callee_name + caller_scope + file_path + id; no
  bare 'name' column. The recipe doc would have pointed agents at
  an invalid query.

- templates/recipes/fan-in.md + fan-out.md: 'most-imported' /
  'they import from many other files' → 'most depended-on' /
  'they depend on many other files'. The dependencies table
  aggregates static imports + dynamic imports + resolved module-
  graph edges, so the import-only framing was narrower than what
  the metric measures.
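
To illustrate why inline-flow `actions: [{type, ...}]` is silently
dropped by a block-list-only parser, here is a rough sketch of such a
parser. It is an assumption about the general approach, not the loader's
real code; `parseActions`, `unquote` and `RecipeAction` are hypothetical
names:

```typescript
// Hypothetical block-list-only `actions:` parser; the real loader may differ.
interface RecipeAction {
  type: string;
  auto_fixable?: boolean;
  description?: string;
}

/** Drop one pair of matching surrounding quotes, if present. */
function unquote(value: string): string {
  const v = value.trim();
  const quoted =
    v.length >= 2 &&
    ((v.startsWith('"') && v.endsWith('"')) ||
      (v.startsWith("'") && v.endsWith("'")));
  return quoted ? v.slice(1, -1) : v;
}

function parseActions(markdown: string): RecipeAction[] {
  // Frontmatter is the text between the leading pair of --- markers.
  const fm = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  if (fm === null) return [];
  const actions: RecipeAction[] = [];
  let inActions = false;
  let current: RecipeAction | undefined;
  for (const line of (fm[1] ?? "").split(/\r?\n/)) {
    if (/^actions:\s*$/.test(line)) {
      inActions = true; // only a bare top-level `actions:` opens the list
      continue;
    }
    if (/^\S/.test(line)) {
      inActions = false; // any other top-level key (or inline flow) closes it
      current = undefined;
      continue;
    }
    if (!inActions) continue;
    const item = /^\s*-\s+type:\s*(.+)$/.exec(line);
    if (item !== null) {
      current = { type: unquote(item[1] ?? "") };
      actions.push(current);
      continue;
    }
    const field = /^\s+(auto_fixable|description):\s*(.+)$/.exec(line);
    if (field !== null && current !== undefined) {
      if (field[1] === "auto_fixable") {
        current.auto_fixable = (field[2] ?? "").trim() === "true";
      } else {
        current.description = unquote(field[2] ?? "");
      }
    }
  }
  return actions;
}
```

With this shape, a line such as `actions: [{type: review-coupling}]`
never matches the bare `actions:` opener, so the result is an empty list
rather than an error.
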
@SutuSebastian SutuSebastian merged commit 5110b1a into main May 2, 2026
9 checks passed
@SutuSebastian SutuSebastian deleted the feat/recipes-content-registry branch May 2, 2026 09:09
@github-actions github-actions Bot mentioned this pull request May 1, 2026
SutuSebastian added a commit that referenced this pull request May 2, 2026
… status snapshot (#38)

The Adjacent — also shipped post-refresh block already listed PR #35 (MCP server). Adding PR #37: bundled recipes migrated to templates/recipes/<id>.{sql,md}, project-local recipes via .codemap/recipes/, catalog gains source/body/shadows fields, YAML frontmatter actions, load-time DML/DDL deny-list. Pure docs refresh — no behavior change.
SutuSebastian added a commit that referenced this pull request May 2, 2026
Agent-first: gives data + structured warning; preserves agent autonomy (e.g. 'I want stale to compare with what changed'). Refuse + auto-reindex both rejected — refuse forces 3 round-trips for content already on disk; auto-reindex hides side-effects from a read tool and breaks the read/write separation we kept clean across PRs #33 / #35 / #37.

All 6 grill questions now settled — ready for tracer 1.
SutuSebastian added a commit that referenced this pull request May 2, 2026
* docs(plans): draft targeted-read-cli (codemap show)

One-step CLI verb for 'where is this symbol' — codemap show <symbol> returns file_path:line_start-line_end + signature. Pure ergonomic affordance over SELECT … FROM symbols WHERE name = ?; no schema change.

Plan covers surface (show + --all + --kind + --in flags), wiring (cmd-show.ts + show-engine.ts mirroring cmd-context/cmd-validate), MCP integration via the plan §35 pattern, and a 4-commit tracer-bullet sequence (~half day).

5 open questions worth a grill round before code: MCP tool registration, multiple-match UX (error vs list), exact vs fuzzy matching, file-scope filter, snippet-sibling timing. Status: design pass; not yet implemented.

* docs(plans): settle Q-1 — show ships as a dedicated MCP tool

Mirrors the every-verb-becomes-a-tool pattern from PR #35. Discoverability win matters for agents that don't know the symbols schema; token savings compound. ~25 LOC registration; reuses the engine helper.

* docs(plans): settle Q-2 — always-wrap {matches, disambiguation?} envelope

Agent-first reframing: 'error by default' was 2023-era reasoning; today's frontier models reason fine over 2-5 candidates given context. Always-wrap gives a single shape to learn / document / test, plus forward extensibility for future disambiguation aids (nearest_to_cursor, most_recently_modified, caller_count) without breaking the contract.

Single match: {matches: [{...}]}. Multi-match: {matches: [...], disambiguation: {n, by_kind, files, hint}}. Agent reads result.matches[0] either way.

* docs(plans): settle Q-3 — exact match only; fuzzy stays in query

show contract is sharp: 'I know the name → I want to know where it lives.' Agents have the exact name 95% of the time (stack traces, import statements, prior query results). Error message points at query+LIKE for fuzzy so the agent's next move is explicit. Avoids burning a flag on a feature query already does.

* docs(plans): settle Q-4 — ship --in <path> file-scope filter

Closes the loop with the Q-2 disambiguation envelope: agent sees candidate files in disambiguation.files, narrows with --in via parameter add (not tool-switch to query). --kind handles 'function vs const' ambiguity; --in handles 'this folder vs that folder' (the common case). ~5 LOC. Match rule: prefix if ends with / or names a directory, else exact file.

* docs(plans): expand to show + snippet, settle Q-5, open Q-6, fold fact-check refinements

After fact-checking against the refreshed codemap index, snippet's marginal cost is smaller than initially framed:

- findSymbolsByName (Q-1 helper) is shared with show — free reuse
- readFileSync + toProjectRelative + hashContent + files.content_hash IS the literal pattern cmd-validate.ts already uses for stale detection — pure copy-paste
- ~2-3 hours marginal cost on top of show; splitting into a follow-up PR would duplicate docs / changeset / Rule-10 mirror overhead

Q-5 settled: ship snippet alongside show in v1. Output is {matches: [{...metadata, source, stale?}]} — additive on Q-2's envelope, no shape divergence.

Q-2 updated: explicit requirement that BOTH the CLI's --json mode AND the MCP tool wrap in {matches, disambiguation?} — required to preserve plan §4 uniformity (CLI prints array AND MCP returns envelope = uniformity broken).

Q-4 updated: --in <path> normalization via existing toProjectRelative(projectRoot, p) helper (verified — already handles leading ./, trailing /, Windows backslash → POSIX). No reinventing.

Q-6 opened: stale-file behavior for snippet — read+flag (1) vs refuse (2) vs auto-reindex (3). Bias toward (1) per agent-first lens (no hostile round-trip, no hidden side-effects).

Tracer-bullet sequence expanded from 4 → 6 commits (~1 day total). Non-goals updated: snippet no longer deferred; --with-source flag explicitly rejected per Q-5; auto-reindex on stale explicitly rejected pending Q-6 confirmation; glob characters in --in explicitly out of scope.

* docs(plans): settle Q-6 — read + flag stale snippets

Agent-first: gives data + structured warning; preserves agent autonomy (e.g. 'I want stale to compare with what changed'). Refuse + auto-reindex both rejected — refuse forces 3 round-trips for content already on disk; auto-reindex hides side-effects from a read tool and breaks the read/write separation we kept clean across PRs #33 / #35 / #37.

All 6 grill questions now settled — ready for tracer 1.

* feat(show): show-engine.ts findSymbolsByName + tests (Tracer 1 of 6)

Pure transport-agnostic lookup engine — same shape audit-engine.ts / query-engine.ts use (PRs #33 / #35). findSymbolsByName({db, name, kind?, inPath?}) returns SymbolMatch[] with deterministic order (file_path ASC, line_start ASC) so callers slice for stable disambiguation output.

Per Q-3 settled: name match is case-sensitive (exact). Per Q-4 settled: inPath uses a directory-vs-file heuristic — trailing slash OR no extension in trailing segment treats as prefix (LIKE 'src/cli/%'); else exact file match (file_path = ?). Caller normalizes via toProjectRelative before passing.

12 unit tests cover: single match, unknown name, ambiguous (3-match deterministic order), kind filter narrowing, inPath as directory (no slash + with slash), inPath as file (exact + miss), kind+inPath compose AND, returned columns, case-sensitivity.

Reuses the symbols table directly. No schema change. Tracer 2 wires the CLI verb on top.
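
The directory-vs-file heuristic for inPath can be sketched roughly as
below. `buildInPathFilter` and `PathFilter` are hypothetical names, and
the sketch deliberately omits LIKE wildcard escaping, which a later
commit in this log adds:

```typescript
// Hypothetical sketch of the inPath heuristic; not the actual show-engine.ts code.
interface PathFilter {
  clause: string; // SQL fragment with one bound parameter
  param: string; // value to bind
}

function buildInPathFilter(inPath: string): PathFilter {
  const trimmed = inPath.replace(/\/+$/, "");
  const lastSegment = trimmed.slice(trimmed.lastIndexOf("/") + 1);
  // Trailing slash, or no dot in the last segment, means "treat as directory".
  const looksLikeDirectory = inPath.endsWith("/") || !lastSegment.includes(".");
  return looksLikeDirectory
    ? { clause: "file_path LIKE ?", param: `${trimmed}/%` }
    : { clause: "file_path = ?", param: trimmed };
}
```

Under these assumptions `src/cli` and `src/cli/` both yield the prefix
pattern `src/cli/%`, while `src/cli/cmd-show.ts` yields an exact match.
A directory whose name contains a dot needs the trailing slash in this
sketch.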

* feat(show): codemap show <name> CLI verb (Tracer 2 of 6)

Implements the show CLI verb per the settled grill round:

- parseShowRest — argv parser supporting <name> + --kind + --in + --json (+ --help / -h). Errors on missing name, extra positional, unknown flags, and missing flag values.
- buildShowResult — wraps engine output in the {matches, disambiguation?} envelope (Q-2 settled). Single-match → {matches}; multi-match adds n / by_kind / files / hint structured aids.
- runShowCmd — bootstraps codemap, normalizes --in via toProjectRelative (Q-4), runs findSymbolsByName, renders. JSON mode prints the envelope verbatim; terminal mode prints path:line-line + signature per row + a stderr disambiguation hint on multi-match.
- Error UX (Q-3): unknown name → routed-error message pointing at `codemap query --json "SELECT … LIKE '%name%'"` so the agent's next step is explicit.

Wired into main.ts dispatch + bootstrap.ts validateIndexModeArgs known-verbs list + help text. toProjectRelative exported from cmd-validate.ts (was private).

13 unit tests cover parser (help/missing/extra/unknown-flag/--kind/--in/order-independence/throws-if-not-show) + buildShowResult envelope (single / zero / multi / file dedup).

Smoke tested: show runQueryCmd / --json / --in / unknown-name all behave per spec.
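
A rough sketch of the always-wrap envelope, for readers who want the
shape in code. The `SymbolMatch` fields and the hint wording are
assumptions based on the descriptions in this log, not the real
cmd-show.ts types:

```typescript
// Hypothetical types and hint text; only the envelope keys come from the log.
interface SymbolMatch {
  name: string;
  kind: string;
  file_path: string;
  line_start: number;
  line_end: number;
  signature: string;
}

interface Disambiguation {
  n: number;
  by_kind: Record<string, number>;
  files: string[];
  hint: string;
}

interface ShowResult {
  matches: SymbolMatch[];
  disambiguation?: Disambiguation;
}

function buildShowResult(matches: SymbolMatch[]): ShowResult {
  // Zero or one match: plain envelope, no disambiguation block.
  if (matches.length <= 1) return { matches };
  const by_kind: Record<string, number> = {};
  for (const m of matches) by_kind[m.kind] = (by_kind[m.kind] ?? 0) + 1;
  // De-duplicate files while keeping first-seen order.
  const files = Array.from(new Set(matches.map((m) => m.file_path)));
  return {
    matches,
    disambiguation: {
      n: matches.length,
      by_kind,
      files,
      hint: "Narrow with --kind <kind> or --in <path>.",
    },
  };
}
```

Callers read `result.matches[0]` in both the single-match and
multi-match case; `disambiguation` is purely additive.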

* feat(show): readSymbolSource + getIndexedContentHash with stale detection (Tracer 3 of 6)

Adds the snippet-side engine helpers per Q-5 (ship snippet alongside show) + Q-6 (read + flag stale, never refuse + never auto-reindex):

- readSymbolSource({match, projectRoot, indexedContentHash?}) returns {source, stale, missing}. Reuses readFileSync + hashContent + the same FS pattern cmd-validate.ts uses (verified during fact-check). Line slicing is 1-indexed inclusive matching symbols.line_start/line_end. Clamps line_end past EOF instead of throwing.

- getIndexedContentHash(db, filePath) — convenience helper for the same SELECT cmd-validate.ts uses.

Stale semantics (Q-6): source is ALWAYS returned when the file exists; stale: true is just a metadata flag the agent reads. Missing file → {source: undefined, stale: true, missing: true}. indexedContentHash undefined → never marks stale (caller opts out of staleness checks).

7 new unit tests cover line slicing happy path, missing file, hash-match (stale: false), hash-mismatch (stale: true + source still returned), EOF clamping, opt-out via undefined hash, and getIndexedContentHash lookup. Total now 19 pass on show-engine.

Tracer 4 next: cmd-snippet.ts CLI verb on top of these helpers.
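
The read + flag semantics can be sketched as a pure function. This is
an illustration under assumptions: `sliceSymbolSource` is a hypothetical
name, and it takes the file content and hashes as arguments instead of
touching the filesystem the way readSymbolSource is described to:

```typescript
// Hypothetical pure sketch of the slicing + stale rules; no filesystem access.
interface SourceRead {
  source: string | undefined;
  stale: boolean;
  missing: boolean;
}

function sliceSymbolSource(opts: {
  content: string | undefined; // undefined models a file missing on disk
  lineStart: number; // 1-indexed, inclusive
  lineEnd: number; // 1-indexed, inclusive
  currentHash?: string; // hash of `content` as read now
  indexedHash?: string; // hash recorded at index time; omit to opt out
}): SourceRead {
  if (opts.content === undefined) {
    return { source: undefined, stale: true, missing: true };
  }
  const lines = opts.content.split("\n");
  // Clamp past-EOF ranges instead of throwing.
  const end = Math.min(opts.lineEnd, lines.length);
  const source = lines.slice(opts.lineStart - 1, end).join("\n");
  // Stale is only a flag: the source is returned either way.
  const stale =
    opts.indexedHash !== undefined && opts.indexedHash !== opts.currentHash;
  return { source, stale, missing: false };
}
```

The design point this mirrors is that a hash mismatch never suppresses
the source; the caller decides what to do with `stale: true`.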

* feat(snippet): codemap snippet <name> CLI verb (Tracer 4 of 6)

Sibling to show: same lookup contract (name + kind + in + json) but returns source text from disk per match. Output envelope: {matches: [{...metadata, source, stale, missing}], disambiguation?: {...}} — additive on Q-2's envelope (one source/stale/missing field per row, never a shape divergence).

- parseSnippetRest mirrors parseShowRest's parser (same flags, same errors).
- buildSnippetResult enriches each SymbolMatch with source/stale/missing via getIndexedContentHash + readSymbolSource (Tracer 3 helpers). Per Q-6: source ALWAYS returned when file exists; stale/missing are pure metadata flags the agent reads.
- runSnippetCmd mirrors runShowCmd's bootstrap + lookup + render. Terminal mode prints path:line-line[STALE/MISSING flags] + source; --json mode emits the envelope verbatim. Stderr hint when any row is stale points at codemap / codemap --files <path> for refresh.

Wired into main.ts dispatch + bootstrap.ts known-verbs + help text.

11 unit tests cover parser (help/missing/extra/unknown/--kind/--in/order/throws-not-snippet) + buildSnippetResult (single match w/ source, stale flag on hash drift, missing flag on rm'd file, multi-match disambiguation envelope).

Smoke tested: bun src/index.ts snippet runQueryCmd --json returns the function source + metadata + stale: false.

* feat(mcp): show + snippet MCP tools (Tracer 5 of 6)

Wires the show + snippet CLI verbs as MCP tools per Q-1 settled. Both follow the established cmd-* ↔ register*Tool pattern from PR #35; both reuse the same engine helpers (findSymbolsByName, buildShowResult, buildSnippetResult) so output shape is verbatim from each tool's CLI counterpart's --json envelope.

- registerShowTool — args {name, kind?, in?}, returns the {matches, disambiguation?} envelope. Tool description teaches: 'Use snippet for source text; use query with LIKE for fuzzy lookup' so agents know when to reach for which tool.

- registerSnippetTool — args {name, kind?, in?}, returns the same envelope with source/stale/missing on each match. Description spells out the stale semantics (read + flag, agent decides) since that's the one non-obvious bit.

Both tools route the in arg through toProjectRelative(opts.root, args.in) so MCP callers get the same path-shape leniency as the CLI (--in ./src/cli/, --in src/cli, --in src/cli/cmd-show.ts all work identically).

8 new in-process MCP tests via @modelcontextprotocol/sdk's InMemoryTransport: tools/list lists both, single-match envelope, multi-match disambiguation, in-filter narrows, unknown-name returns empty, snippet source on fresh file (stale: false), stale flag on hash drift, missing flag on rm'd file.

Total now 38 MCP tests pass.

* docs(show + snippet): architecture / glossary / README / agent rule + skill (Tracer 6 of 6)

Lifts canonical bits out of docs/plans/targeted-read-cli.md per docs/README.md Rule 2 (delete plans on ship). Surfaces touched:

- architecture.md § CLI usage gains a 'Show / snippet wiring' paragraph documenting the cmd-show ↔ cmd-snippet ↔ show-engine seam, the {matches, disambiguation?} envelope, the toProjectRelative + hashContent primitive reuse from cmd-validate.ts, and the stale-file behavior (read + flag, no auto-reindex).

- glossary.md § S: new entries 'show' and 'snippet' with disambiguation envelope reference + cross-link to architecture.md.

- roadmap.md: removed the targeted-read-cli backlog entry (now shipped).

- README.md CLI block: added show + snippet examples covering the metadata vs source-text distinction and the disambiguation envelope shape.

- .agents/rules/codemap.md + templates/agents/rules/codemap.md (mirrored per Rule 10): added two CLI table rows (Targeted read metadata, Targeted read source text) + a 'Targeted reads' section documenting the envelope, --kind / --in flags, exact-match semantics, and snippet stale-file behavior.

- .agents/skills/codemap/SKILL.md + templates/agents/skills/codemap/SKILL.md (mirrored): MCP tools list extended with show + snippet entries describing args, envelope shape, and stale semantics. Tools list in agent rule extended too.

- docs/plans/targeted-read-cli.md DELETED (Rule 2 — plan content fully lifted into architecture / glossary / agent files).

- Minor changeset added (additive features, no schema breaks).

* chore(security): defence-in-depth fixes from PR self-audit

Three small hygiene fixes from the security audit on PR #39:

1. agents-init.ts relPathToAbsSegments — now rejects '..' and '.' segments instead of just filtering empty strings. Defence in depth: today's callers source rel from listRegularFilesRecursive (package-controlled, never produces '..'), but a future caller passing user-provided relative paths would otherwise allow join(destRoot, '..', 'etc', 'passwd') to write outside destRoot. Throws loud instead of silently writing somewhere unexpected. 5 new unit tests cover happy path, empty-segment filter, '..' at start, '..' in middle, and '.' rejection.

2. cmd-show.ts + cmd-snippet.ts unknown-name error — escapes single-quotes (SQLite '' convention) before embedding the user-provided name into the suggested SQL hint. No execution risk (the message is just text), but the previous version emitted SQL like LIKE '%'; DROP TABLE symbols; --%' which looks injection-y in agent traces and breaks if the agent copy-pastes the hint. Now safe for names like O'Brien.

3. .github/workflows/ci.yml — added an audit job running 'bun audit' on every PR. Marked continue-on-error: true (non-blocking) so transient registry issues or low-severity transitive CVEs don't gate merges. Promote to a hard gate once the team agrees on a vulnerability budget. Verified bun audit works locally + reports zero vulnerabilities today.

All three are tiny, additive, and follow defence-in-depth rather than fixing live exploits — the original audit found no exploitable vulnerabilities in the codebase.
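
Fixes 1 and 2 are small enough to sketch. Both functions below are
hypothetical stand-ins (`relPathToSegments`, `escapeSqlStringLiteral`),
written only from the descriptions above and not copied from
agents-init.ts or cmd-show.ts:

```typescript
// Hypothetical sketches of the two hygiene fixes; names are assumptions.

/** Split a relative path into segments, rejecting "." and "..". */
function relPathToSegments(rel: string): string[] {
  const segments = rel.split(/[\\/]/).filter((s) => s !== "");
  for (const s of segments) {
    if (s === "." || s === "..") {
      // Fail loudly rather than let a join() escape the destination root.
      throw new Error(`Unsafe path segment "${s}" in "${rel}"`);
    }
  }
  return segments;
}

/** Double single quotes so a value can sit inside a SQL string literal. */
function escapeSqlStringLiteral(value: string): string {
  return value.replace(/'/g, "''");
}

// Example hint text built from a user-provided name.
const hint = `SELECT * FROM symbols WHERE name LIKE '%${escapeSqlStringLiteral("O'Brien")}%'`;
```

The escaping here only keeps the suggested hint well-formed as text; any
query that is actually executed should still bind values as parameters.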

* fix(show): escape SQL LIKE wildcards in --in path (PR #39 CodeRabbit feedback, Major)

Real bug verified against actual SQLite semantics: when --in src/__tests__ became LIKE 'src/__tests__/%', the underscores matched ANY single char so the query also matched src/aatestsZZ/foo.ts. Underscores are ubiquitous in TS layouts (__tests__, __mocks__, _utils, _helpers).

Fix: new escapeLikeLiteral helper escapes _, %, and \ (the escape char itself); the LIKE clause now uses ESCAPE '\'. Trailing % we append stays an unescaped wildcard. Symmetric handling so paths with literal '%' (rare but possible in OS file names) also match exactly.

Tests: 1 integration test seeds both src/__tests__/setup.ts and a same-shape decoy src/aatestsZZ/decoy.ts; --in src/__tests__ now returns only the real one. 4 unit tests cover the escape helper (underscore, percent, backslash, identity).
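
A minimal sketch of the escaping step, assuming backslash as the escape
character to match the ESCAPE '\' clause mentioned above. The function
bodies are illustrative; only the name escapeLikeLiteral comes from the
commit message, and `dirPrefixPattern` is hypothetical:

```typescript
// Illustrative sketch; pair the pattern with: file_path LIKE ? ESCAPE '\'

/** Escape the LIKE metacharacters _ and %, plus the escape char itself. */
function escapeLikeLiteral(value: string): string {
  return value.replace(/[\\%_]/g, (ch) => "\\" + ch);
}

/** Build a directory-prefix pattern; the trailing % stays a real wildcard. */
function dirPrefixPattern(dir: string): string {
  return escapeLikeLiteral(dir) + "/%";
}
```

With this, `src/__tests__` becomes the pattern `src/\_\_tests\_\_/%`, so
each underscore matches only a literal underscore.
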
SutuSebastian added a commit that referenced this pull request May 2, 2026
…d items in competitive-scan §4-§5 (#40)

Two stale spots fixed:

1. fallow.md 'Adjacent — also shipped post-refresh' block: added PR #39 (targeted-read CLI: show + snippet + security hygiene fixes).

2. competitive-scan-2026-04.md § 4 'What moved to the roadmap': marked Recipes-as-content registry (PR #37) and Targeted-read CLI (PR #39) as ✅ shipped. Both were sitting as 'still backlog' even though their roadmap entries were removed at ship time. Updated § 5 'Open questions': 'Should recipes own their description?' is settled (PR #37 file-pair shape).

No code change; pure docs honesty.
SutuSebastian added a commit that referenced this pull request May 4, 2026
… recipe

The "fully capable, no half-way APIs" principle reshapes three things:

1. **LCOV ingester ships in v1** alongside Istanbul. Original draft deferred
   LCOV to v1.x, which would exclude `bun test --coverage` users — i.e.
   codemap's own primary runtime. That's the textbook half-baked surface
   the principle bans. Two parser front-ends share one `upsertCoverageRows`
   core; LCOV is regex tokenizing over `SF:` / `DA:` / `end_of_record`.
   Tracer 2 splits into 2a (shared core + Istanbul parser) and 2b (LCOV
   parser), both writing identical normalised CoverageRow[] into the same
   upsert path.

2. **`--source istanbul|lcov` flag dropped.** Auto-detection from extension
   (`.json` → istanbul, `.info` → lcov, directory → probe both, error on
   ambiguous) is unambiguous; a flag for "tell codemap what it can already
   see" is API noise. Misnamed files can be renamed (one-liner) cheaper
   than codemap can grow a flag.

3. **Killer recipe ships as bundled `untested-and-dead.{sql,md}`** in
   `templates/recipes/`. Per the recipes-as-content registry (PR #37), the
   high-value queries become first-class agent surface. A buried doc
   snippet would be invisible to agents at session start; the bundled
   recipe shows up in `--recipes-json` and gets a `codemap query --recipe
   untested-and-dead` direct invocation.

Tracer 4 also fans out: Istanbul + LCOV fixtures cover the same partial
coverage shape; three golden recipes (`coverage-istanbul.json`,
`coverage-lcov.json`, `untested-and-dead.json`) prove format equivalence.
Out-of-scope, alternatives, performance section, title, and goal
statement all updated to match.
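
For readers unfamiliar with LCOV, the tokenizing over `SF:` / `DA:` /
`end_of_record` and the extension-based detection could look roughly
like this. It is a sketch under assumptions: `parseLcov`, `LcovFile` and
`detectFormat` are hypothetical names, and directory probing is left
out:

```typescript
// Hypothetical LCOV front-end sketch; not the plan's actual ingester.
interface LcovFile {
  path: string;
  lines: Map<number, number>; // line number -> hit count
}

function parseLcov(text: string): LcovFile[] {
  const files: LcovFile[] = [];
  let current: LcovFile | undefined;
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (line.startsWith("SF:")) {
      // SF:<source file path> opens a record.
      current = { path: line.slice(3), lines: new Map() };
    } else if (line.startsWith("DA:") && current !== undefined) {
      // DA:<line>,<hits>[,<checksum>] records one line's execution count.
      const m = /^DA:(\d+),(\d+)/.exec(line);
      if (m !== null) current.lines.set(Number(m[1]), Number(m[2]));
    } else if (line === "end_of_record" && current !== undefined) {
      files.push(current);
      current = undefined;
    }
  }
  return files;
}

type CoverageFormat = "istanbul" | "lcov";

/** Extension-only detection; returns undefined when the path is ambiguous. */
function detectFormat(path: string): CoverageFormat | undefined {
  if (path.endsWith(".json")) return "istanbul";
  if (path.endsWith(".info")) return "lcov";
  return undefined;
}
```

Other LCOV record types (function and branch data) are ignored here,
which lines up with the plan's statement-coverage-only scope for v1.
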
SutuSebastian added a commit that referenced this pull request May 4, 2026
…age` table) (#56)

* docs(plan): static coverage ingestion (Istanbul JSON → `coverage` table)

Plans the C.11 candidate from `research/fallow.md` — `codemap ingest-coverage <path>`
reads Istanbul `coverage-final.json` into two new tables (`coverage` symbol-level +
`file_coverage` rollup), joinable to `symbols` for the killer "what's structurally
dead AND untested?" recipe in one query.

Resolves the open question from `fallow.md § 6` ("symbols column vs separate table?")
in favour of a separate table with `ON DELETE CASCADE` (D1) — coverage shape evolves
independently of structural columns; LEFT JOIN keeps NULL semantics explicit; rows
survive `--full` reindex via the `query_baselines` precedent (D6).

Key decisions:
- Istanbul JSON in v1; LCOV in v1.x; raw V8 traces never (D3, fallow's paid moat).
- One-shot `ingest-coverage` verb decoupled from `codemap` index runs (D4) — coverage
  cadence (per `bun test --coverage`) ≠ index cadence (per file edit).
- Statement coverage only in v1 (D5); branch/function deferred until a consumer asks.
- MCP/HTTP exposure as a query column, not a separate `coverage` tool (D9) — composes
  with every existing recipe + ad-hoc SQL.
- `codemap audit --delta coverage` deferred to v1.x (D10) — raw schema first.

Five-tracer plan: schema bump → engine → CLI verb → fixture + golden recipe → docs.
Plan only — implementation follows after CodeRabbit review per the established
workflow (PRs #46/47, #49/50, #51/52, #53/54).

* docs(plan): fact-check fixes — drop hallucinated SQL/projection/runner claims

Self-audit against the actual codebase surfaced four claims that didn't hold:

1. Killer recipe SQL referenced `callee_id` — `calls` is name-keyed
   (`callee_name TEXT`, no symbol-id FK; see `db.ts` `CallRow`). Rewrote
   the "no callers" predicate as `NOT EXISTS (… WHERE callee_name = s.name)`.
2. D7 claimed line-range projection is "the same `markers` already uses" —
   `markers` is line-pinned (`line_number INTEGER`), no projection.
   Reworded as "novel for this plan" with the actual mechanic spelled out.
3. D3 listed `bun test --coverage` as an Istanbul JSON emitter — `bun test
   --help` shows only `text` / `lcov` reporters today. Removed bun from the
   Istanbul-emitters list; left vitest/jest/c8/nyc with the explicit reporter
   flags they need.
4. D12 contradicted D6 ("rows absent until re-ingest" vs "rows survive
   `--full`"). Reconciled: empty is the correct initial state on first bump;
   subsequent bumps preserve via the `dropAll()` exclusion. Quoted the
   `lessons.md` policy verbatim instead of paraphrasing.

* docs(plan): v2 — fix CASCADE hazard + innermost-wins projection + nits

Self-grilling found two real schema design holes that would block execution:

1. **D6 CASCADE hazard.** Original draft keyed `coverage` on
   `symbol_id REFERENCES symbols(id) ON DELETE CASCADE`. Every `--full`
   reindex calls `dropAll()` → drops `symbols` → CASCADE wipes coverage,
   regardless of whether `coverage` itself was excluded from `dropAll()`.
   Recreated `symbols` get fresh auto-increment IDs anyway → coverage
   permanently lost without re-ingest. Fix: natural-key PK
   `(file_path, name, line_start)` — no FK to `symbols.id`. Survives the
   `symbols` drop-recreate cycle. Trade-off: orphan rows when files are
   deleted; cleaned by one explicit `DELETE FROM coverage WHERE file_path
   NOT IN (SELECT path FROM files)` after every ingest.

2. **D7 overlapping symbols.** Original draft: `line_start ≤ stmt_line ≤
   line_end` matches every enclosing scope. With nested symbols (class
   methods inside classes, closures inside functions), one Istanbul
   statement projects onto 3+ symbols, inflating `total_statements` 2-3×.
   Fix: innermost-wins via `(line_end - line_start) ASC LIMIT 1`. New
   `skipped.statements_no_symbol` counter for statements that fall outside
   every symbol range (top-level expressions, side-effect imports).
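
The innermost-wins rule in item 2 can be sketched in memory as follows.
This is an illustration of the `(line_end - line_start) ASC LIMIT 1`
idea, with hypothetical names (`innermostSymbol`, `SymbolRange`); the
plan itself describes it as a SQL projection:

```typescript
// Hypothetical in-memory equivalent of the innermost-wins projection.
interface SymbolRange {
  name: string;
  line_start: number;
  line_end: number;
}

/** Pick the narrowest symbol whose range contains the statement line. */
function innermostSymbol(
  symbols: SymbolRange[],
  stmtLine: number,
): SymbolRange | undefined {
  let best: SymbolRange | undefined;
  for (const s of symbols) {
    if (stmtLine < s.line_start || stmtLine > s.line_end) continue;
    const span = s.line_end - s.line_start;
    if (best === undefined || span < best.line_end - best.line_start) best = s;
  }
  // undefined means the statement sits outside every symbol range and
  // would be counted under skipped.statements_no_symbol.
  return best;
}
```

Each statement is attributed to at most one symbol, so nested scopes no
longer inflate the enclosing symbols' totals.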

Nits cleared in the same pass:

- D2: drop `file_coverage` rollup table from v1 (aggregatable via
  GROUP BY on the symbol-level table; doubling sources of truth without
  a benchmark is premature). Promote to v1.x with a real query.
- D11: spec the `total_statements = 0 → coverage_pct IS NULL` edge case
  + document the cross-file name-collision lossiness in the killer recipe.
- Drop `--prune` flag (orphan cleanup is unconditional, no flag needed).
- Drop per-row `source` column (single meta key sufficient; one ingest
  at a time).
- Update killer recipe SQL to use the natural-key 3-column join.
- Drop made-up "~50 LoC LCOV ingester" estimate and "<50 ms / <1 ms /
  ~500 KB" performance numbers (no benchmark backed them).
- Tracer 1 / 2 / 3 acceptance criteria updated to match the new schema.
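As a rough model of the schema decisions above (natural key, unconditional orphan cleanup, and the `total_statements = 0` edge case), the following sketch works on plain arrays rather than SQLite. The `CoverageRow` interface, the `covered_statements` field name, and the 0 to 100 percentage scale are assumptions for illustration, not the plan's actual definitions.

```typescript
// Hypothetical row shape; covered_statements is an assumed column name.
interface CoverageRow {
  file_path: string;
  name: string;
  line_start: number;
  total_statements: number;
  covered_statements: number;
}

// Natural key (file_path, name, line_start): stable across a symbols
// drop-recreate cycle because it never references an auto-increment id.
function naturalKey(r: Pick<CoverageRow, "file_path" | "name" | "line_start">): string {
  return `${r.file_path}\u0000${r.name}\u0000${r.line_start}`;
}

// total_statements = 0 maps to NULL rather than 0% or a division by zero.
// Assumption: percentages are expressed on a 0..100 scale.
function coveragePct(r: CoverageRow): number | null {
  if (r.total_statements === 0) return null;
  return (100 * r.covered_statements) / r.total_statements;
}

// In-memory equivalent of:
//   DELETE FROM coverage WHERE file_path NOT IN (SELECT path FROM files)
function pruneOrphans(rows: CoverageRow[], files: Set<string>): CoverageRow[] {
  return rows.filter((r) => files.has(r.file_path));
}

const rows: CoverageRow[] = [
  { file_path: "src/a.ts", name: "run", line_start: 3, total_statements: 4, covered_statements: 1 },
  { file_path: "src/a.ts", name: "noop", line_start: 9, total_statements: 0, covered_statements: 0 },
  { file_path: "src/gone.ts", name: "old", line_start: 1, total_statements: 2, covered_statements: 2 },
];

const kept = pruneOrphans(rows, new Set(["src/a.ts"]));
for (const r of kept) {
  console.log(naturalKey(r).split("\u0000").join(" | "), coveragePct(r));
}
```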

Plan is now ready for tracer-1 implementation. CodeRabbit pass deferred
(rate-limited 57m).

* docs(plan): tighten Bun-native API references (file read + perf note)

Plan correctly inherits the established Node vs Bun runtime split, but the
single tracer-3 reference understated it. Now:

- Tracer 3 cites `packaging.md § Node vs Bun` as the canonical pattern
  source instead of pointing at config.ts in passing.
- Performance section calls out the actual lever — `Bun.file(path).json()`
  uses Bun's native JSON parser, materially faster than V8 `JSON.parse`
  on multi-MB Istanbul payloads (real coverage files for medium codebases
  routinely hit several MB).

No new Bun-native API surfaces are added — the feature doesn't need
globbing, file writes, spawn, or hashing beyond what the existing engines
already use through their abstractions.
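For readers unfamiliar with the Node vs Bun split, one possible shape of the file-read path is sketched below. The `readJson` helper is hypothetical and is not taken from `packaging.md`; it only shows the idea of preferring `Bun.file(path).json()` when the `Bun` global exists and falling back to `readFile` plus `JSON.parse` under Node.

```typescript
import { readFile } from "node:fs/promises";

// Read and parse a JSON file, preferring Bun's file API when running under Bun.
// readJson is an illustrative name, not an existing codemap function.
async function readJson(path: string): Promise<unknown> {
  const bun = (globalThis as any).Bun; // undefined when running under Node
  if (bun !== undefined) {
    return bun.file(path).json();
  }
  return JSON.parse(await readFile(path, "utf8"));
}
```

A caller would use it as `const payload = await readJson("coverage/coverage-final.json")`, where the path is only an example.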

* docs(plan): v3 — ship LCOV in v1 + drop --source flag + bundle killer recipe

The "fully capable, no half-way APIs" principle reshapes three things:

1. **LCOV ingester ships in v1** alongside Istanbul. Original draft deferred
   LCOV to v1.x, which would exclude `bun test --coverage` users — i.e.
   codemap's own primary runtime. That's the textbook half-baked surface
   the principle bans. Two parser front-ends share one `upsertCoverageRows`
   core; LCOV is regex tokenizing over `SF:` / `DA:` / `end_of_record`.
   Tracer 2 splits into 2a (shared core + Istanbul parser) and 2b (LCOV
   parser), both writing identical normalised CoverageRow[] into the same
   upsert path.

2. **`--source istanbul|lcov` flag dropped.** Auto-detection from extension
   (`.json` → istanbul, `.info` → lcov, directory → probe both, error on
   ambiguous) is unambiguous; a flag for "tell codemap what it can already
   see" is API noise. Renaming a misnamed file is a one-liner, cheaper
   than growing a flag in codemap.

3. **Killer recipe ships as bundled `untested-and-dead.{sql,md}`** in
   `templates/recipes/`. Per the recipes-as-content registry (PR #37), the
   high-value queries become first-class agent surface. A buried doc
   snippet would be invisible to agents at session start; the bundled
   recipe shows up in `--recipes-json` and gets a `codemap query --recipe
   untested-and-dead` direct invocation.
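To make the LCOV front-end in item 1 concrete, here is a minimal tokenizer over `SF:` / `DA:` / `end_of_record`. It is a sketch under assumptions: the `parseLcov` name and the per-file `Map` output are hypothetical, and the projection of line hits into the plan's `CoverageRow[]` is left out.

```typescript
// Per-file line hits: line number -> execution count.
type LineHits = Map<number, number>;

// Parse an LCOV tracefile into per-file line hits. Only SF:, DA: and
// end_of_record are interpreted; every other record type is ignored.
function parseLcov(text: string): Map<string, LineHits> {
  const files = new Map<string, LineHits>();
  let current: LineHits | null = null;
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (line.startsWith("SF:")) {
      const path = line.slice(3);
      current = files.get(path) ?? new Map<number, number>();
      files.set(path, current);
    } else if (line === "end_of_record") {
      current = null;
    } else if (current !== null) {
      // DA:<line>,<hits>[,<checksum>]
      const m = /^DA:(\d+),(\d+)/.exec(line);
      if (m) {
        const lineNo = Number(m[1]);
        current.set(lineNo, (current.get(lineNo) ?? 0) + Number(m[2]));
      }
    }
  }
  return files;
}

const sample = [
  "TN:",
  "SF:src/a.ts",
  "DA:1,3",
  "DA:2,0",
  "end_of_record",
  "SF:src/b.ts",
  "DA:10,1",
  "end_of_record",
].join("\n");

const parsed = parseLcov(sample);
console.log(Array.from(parsed.keys()));
```

Hits for a repeated `SF:` path are summed, so a tracefile that lists the same file in several records still yields one entry per file.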

Tracer 4 also fans out: Istanbul + LCOV fixtures cover the same partial
coverage shape; three golden snapshots (`coverage-istanbul.json`,
`coverage-lcov.json`, `untested-and-dead.json`) prove format equivalence.
Out-of-scope, alternatives, performance section, title, and goal
statement all updated to match.
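The extension-based auto-detection that replaces `--source` might look roughly like the following. This is a sketch, not the shipped logic: `detectFormat` is a hypothetical name, the probed file names `coverage-final.json` and `lcov.info` are the conventional reporter outputs and are assumed here, and "ambiguous" is interpreted as both files being present in the directory.

```typescript
type CoverageFormat = "istanbul" | "lcov";

// dirEntries: file names inside `path` when it is a directory, else undefined.
// Probed names are assumptions based on common reporter defaults.
function detectFormat(path: string, dirEntries?: string[]): CoverageFormat {
  if (dirEntries !== undefined) {
    const hasJson = dirEntries.includes("coverage-final.json");
    const hasLcov = dirEntries.includes("lcov.info");
    if (hasJson && hasLcov) throw new Error(`ambiguous: both formats found in ${path}`);
    if (hasJson) return "istanbul";
    if (hasLcov) return "lcov";
    throw new Error(`no coverage file found in ${path}`);
  }
  const lower = path.toLowerCase();
  if (lower.endsWith(".json")) return "istanbul";
  if (lower.endsWith(".info")) return "lcov";
  throw new Error(`cannot infer coverage format from ${path}`);
}

console.log(detectFormat("coverage/coverage-final.json"));
console.log(detectFormat("coverage/lcov.info"));
console.log(detectFormat("coverage", ["lcov.info", "index.html"]));
```

Keeping the directory listing as a parameter makes the decision a pure function, which is easy to cover with fixtures for both formats.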

* docs(plan): v4 — agent-journey audit + bundled recipe shelf (D13)

Walked every D / OOS / tracer item against "fully capable + agent
first-class + no half-baked APIs". Found three half-baked surfaces:

1. **D2 deferral leaks "compose GROUP BY yourself" onto the agent.**
   Deferring the `file_coverage` table is correct (no benchmark proves
   it's needed) — but the agent-facing answer for "rank files by
   coverage" was missing. Fix: keep table deferral, ship a bundled
   `files-by-coverage.{sql,md}` recipe so the GROUP BY view IS
   first-class.

2. **D11 name-collision lossiness was acknowledged but unmitigated.**
   The killer recipe's `callee_name = s.name` cross-file lossiness
   was documented in the recipe SQL comment, but the recipe `.md`
   didn't give the agent any narrowing pattern. Now D11 ships three
   concrete narrowing patterns in the `.md` (file_path scope, default-
   export filter, exported-only restriction) so the agent has
   workable mitigations on day one.

3. **Missing recipe shelf for common agent questions.** Walking the
   journey: only "What's structurally dead AND untested?" had a recipe;
   "Rank files by coverage" and "Worst-covered exported symbols" forced
   ad-hoc SQL. Three recipes fully cover the agent journey end-to-end.
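The aggregation behind a "rank files by coverage" answer can be modelled in a few lines. The actual recipe is SQL (`files-by-coverage.sql`); the TypeScript below only mirrors a GROUP BY over symbol-level rows, and the `covered_statements` field, the 0 to 100 scale, and the nulls-last ordering are assumptions.

```typescript
// Hypothetical symbol-level input row; covered_statements is an assumed name.
interface SymbolCoverage {
  file_path: string;
  total_statements: number;
  covered_statements: number;
}

interface FileCoverage extends SymbolCoverage {
  coverage_pct: number | null; // null when the file has no statements
}

// Equivalent of GROUP BY file_path with SUM() over both counters,
// ordered worst-covered first and files without statements last.
function filesByCoverage(rows: SymbolCoverage[]): FileCoverage[] {
  const byFile = new Map<string, { total: number; covered: number }>();
  for (const r of rows) {
    const acc = byFile.get(r.file_path) ?? { total: 0, covered: 0 };
    acc.total += r.total_statements;
    acc.covered += r.covered_statements;
    byFile.set(r.file_path, acc);
  }
  const out: FileCoverage[] = Array.from(byFile.entries()).map(([file_path, acc]) => ({
    file_path,
    total_statements: acc.total,
    covered_statements: acc.covered,
    coverage_pct: acc.total === 0 ? null : (100 * acc.covered) / acc.total,
  }));
  const rank = (f: FileCoverage) => (f.coverage_pct === null ? Number.POSITIVE_INFINITY : f.coverage_pct);
  return out.sort((a, b) => {
    const ra = rank(a);
    const rb = rank(b);
    if (ra === rb) return a.file_path.localeCompare(b.file_path);
    return ra < rb ? -1 : 1;
  });
}

const ranked = filesByCoverage([
  { file_path: "src/a.ts", total_statements: 4, covered_statements: 1 },
  { file_path: "src/a.ts", total_statements: 6, covered_statements: 4 },
  { file_path: "src/b.ts", total_statements: 2, covered_statements: 0 },
  { file_path: "src/c.ts", total_statements: 0, covered_statements: 0 },
]);
console.log(ranked.map((f) => `${f.file_path}: ${f.coverage_pct}`));
```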

New D13 codifies the bundled-recipe principle: every common agent
question gets a `--recipe` verb. Three v1 recipes:
- `untested-and-dead.{sql,md}` (killer, with name-collision mitigations)
- `files-by-coverage.{sql,md}` (replaces D2's table deferral)
- `worst-covered-exports.{sql,md}` (top-N agent ask)

Each `.md` carries a frontmatter `actions` block (per PR #26) so agents
get per-row follow-up hints. All three appear in `--recipes-json`
automatically — agents discover them at session start.

New "Agent journey" section makes the principle visible: a table mapping
every common agent question to the v1 verb that answers it. If a row
ever shows "compose SQL yourself" without a recipe, the surface is
half-baked and needs a recipe before tracer 1 ships.

Tracer 4 expanded: ships all three recipes + five golden snapshots
(adds files-by-coverage.json + worst-covered-exports.json on top of the
three existing). Tracer 5 expanded: glossary + agent rule trigger
table gain three new rows.

Plan now passes the principle audit end-to-end.